Word-based confidence measures as a guide for stack search in speech recognition

Authors

  • Chalapathy Neti
  • Salim Roukos
  • Ellen Eide
Abstract

The maximum a posteriori hypothesis is treated as the decoded truth in speech recognition. However, since word recognition accuracy is not 100%, some applications need an independent confidence measure of how good the maximum a posteriori hypothesis is relative to the spoken truth. Efforts are in progress [1, 2, 3] to develop such confidence measures with the intent of applying them to assessment of the confidence of whole utterances [4], rescoring of N-best lists, etc. In this paper, we explore the use of word-based confidence measures to adaptively modify the hypothesis score during search in continuous speech recognition: specifically, based on the confidence of the current sequence of hypothesized words during search, the weight of its prediction is changed as a function of the confidence. Experimental results are described for ATIS and SwitchBoard tasks. About 8% relative reduction in word error is obtained for ATIS.
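The idea of changing the weight of a prediction as a function of confidence can be illustrated with a minimal sketch. The abstract does not give the functional form, so everything below is an assumption for illustration: the names `lm_weight` and `extend_score`, the linear mapping from mean confidence to weight, and the bounds `w_min` and `w_max` are hypothetical and are not taken from the paper.

```python
def lm_weight(history_conf, w_min=0.5, w_max=1.0):
    """Map the confidence of the hypothesized history to a prediction weight.

    history_conf: per-word confidences in [0, 1] for the words already
    hypothesized on this partial path.
    The linear interpolation between w_min and w_max is an assumption
    made for illustration, not the paper's formula.
    """
    if not history_conf:
        return w_max  # no history yet: use the full weight
    mean_conf = sum(history_conf) / len(history_conf)
    return w_min + (w_max - w_min) * mean_conf


def extend_score(path_score, acoustic_logp, lm_logp, history_conf):
    """Score of a partial path after appending one more word.

    The language-model log-probability (the prediction made from the
    history) is scaled by a weight that depends on how confident the
    decoder is in that history.
    """
    return path_score + acoustic_logp + lm_weight(history_conf) * lm_logp
```

In a stack decoder, a function like `extend_score` would be called each time a partial hypothesis is popped and extended by a candidate word, so that predictions made from a low-confidence history contribute less to the ranking of hypotheses.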


Similar resources

Dynamic tuning of language model score in speech recognition using a confidence measure

Speech recognition errors limit the capability of language models to predict subsequent words correctly. An effective way to enhance the language model is to use confidence measures. Most current efforts to develop confidence measures for speech recognition focus on applying these measures to the final recognition result. However, using these measures early in the sear...


Improved speech recognition using iterative decoding based on confidence measures

In this paper, a decoding method incorporating word-level confidence measures for improved speech recognition is presented. At first, we focus on the estimation of confidence measures from the word graph and evaluate them in word graph rescoring (2nd-pass in 2-pass search system). Next, we propose the lexical tree search (1st-pass in 2-pass search system) incorporating the word-level confidence meas...


Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, the IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieval of this archive, is one of the key issues for this broadcasting...


Fast estimation of warping factors in vocal tract length normalization using scores obtained from gender detection modeling

The performance of automatic speech recognition (ASR) systems is adversely affected by variations in speakers, audio channels and environmental conditions. Making these systems robust to these variations is still a big challenge. One of the main sources of variation among speakers is the difference in their Vocal Tract Length (VTL). Vocal Tract Length Normalization (VTLN) is an effe...


Time-first search for large vocabulary speech recognition

This paper describes a new search technique for large vocabulary speech recognition based on a stack decoder. Considerable memory savings are achieved with the combination of a tree-based lexicon and a new search technique. The search proceeds time-first, that is, partial path hypotheses are extended into the future in the inner loop and a tree walk over the lexicon is performed as an outer loop...




Publication date: 1997